Abstract. We study matchings on sparse random graphs by means of the cavity method. 
We first show how the method reproduces several known results about maximum and perfect 
matchings in regular and Erdos-Renyi random graphs. Our main new result is the computation 
of the entropy, i.e. the leading order of the logarithm of the number of solutions, of matchings 
with a given size. We derive both an algorithm to compute this entropy for an arbitrary graph 
with a girth that diverges in the large size limit, and an analytic result for the entropy in regular 
and Erdos-Renyi random graph ensembles. 



1. Introduction 

We study a classical problem of graph theory, namely the size and number of matchings on 
various types of random graphs. This problem has been intensively studied for a long time by 
mathematicians and computer scientists [?]. Here we address it using some techniques developed 
in the statistical mechanics of spin glasses [3]. Such approaches have been used in recent years 
to describe successfully the typical cases of random combinatorial problems such as the weighted 
matching (or assignment) problem [?], the traveling salesman problem [?], the vertex cover on random 
graphs [?], K-satisfiability [?], or the coloring of random graphs [?]. 

Here we apply the cavity method [?] to describe the matchings on ensembles of sparse random 
graphs with a given degree distribution. We work within the replica symmetric (RS) version of the 
cavity method, and we argue that it gives exact results for these problems. In fact we show how 
the method reproduces several known results about the size of the maximum matching (which is 
also the maximum number of self-avoiding dimers) and the existence of perfect matchings (the 
possibility of covering the graph with N/2 dimers). This also confirms the previous result by Zhou 
and Ou-Yang [10], who also used the cavity method, but in a different way (we discuss below the 
differences between our approaches). 

Our main new result is the computation of the entropy, i.e. the leading order of the logarithm 
of the number of solutions, of matchings with a given size in large sparse random graphs. We 
derive both an algorithm to compute this entropy for arbitrary graphs with a girth (the length of 
the shortest graph cycle) that diverges in the large size limit, and an analytic result for the entropy 
in regular and Erdos-Renyi random graph ensembles. 

The cavity method has not yet been proved to be a rigorous tool; however, it is widely believed 
(at least by physicists) to be exact, and in some cases its predictions have been confirmed rigorously. 
Let us mention the work of Talagrand [11] who, using some of the tools developed by Guerra 
[12], proved the validity of the Parisi formula for the partition function of the Sherrington-Kirkpatrick 
model (Parisi's original work [13] uses the replica method, but it can be reformulated in cavity terms 
[?]). Aldous [?] developed the local weak convergence method and proved the $\zeta(2)$-limit for the 
random assignment problem, initially found in [?]. In this same problem, Bayati, Shah and Sharma 
[?] proved the convergence of a "belief propagation" algorithm, which is basically the replica 
symmetric cavity method, for finding the lowest weight assignment in generic bipartite graphs. 
Recently, Bandyopadhyay and Gamarnik [?] have used this local weak convergence strategy to 
derive some results on the entropies in the problems of graph coloring and independent sets, in 
regions of parameters where the RS cavity solution is the correct one. The local weak convergence 
method was also used for weighted matchings in sparse random graphs in [17]. 
Because of these recent developments, and of the simple replica symmetric nature of the 
matching problem, we believe that it should be possible to turn all our results into rigorous 
statements. We hope that this work will turn out to be useful also in the opposite direction, i.e. that 
working on rigorous proofs of our results for matching will help to develop the rigorous version of 
the cavity method. 

The matching problem on a graph is equivalent to a physical model of dimers. This was 
mostly studied on planar graphs (lattices), where there is a beautiful method by Kasteleyn [?] 
which shows how to count exactly dimer arrangements (perfect matchings). On non-planar regular 
graphs a Bethe mean field approximation, which is known to be exact on the Bethe lattice, has been 
developed in [?] and references therein. Our work generalizes these results and gives the solution 
of dimer models on sparse random graphs. 

The paper is organized as follows. In section 2 we set up our notations and overview the main 
known results for the matching on sparse random graphs. In section 3 we introduce the cavity 
approach to the matching problem and derive the size of the maximum matching and the entropy of 
matchings of a given size on a typical random graph. We also describe approximate polynomial 
algorithms for sampling and counting matchings on a given graph. In section 4 we give results for 
the size and the number of matchings for the ensembles of regular and Erdos-Renyi random graphs, 
and we show that the replica symmetric ansatz is stable for these two ensembles. In section 5 
we discuss the alternative 1RSB solution at zero temperature which was obtained by Zhou and 
Ou-Yang [10]. The conclusion summarizes this work and gives some perspective on how it could 
be turned into rigorous results. 

2. Background and notations 

Consider a graph $G(V, E)$ with $N$ vertices ($N = |V|$) and a set of edges $E$. A matching of $G$ is a 
subset of edges $M \subseteq E$ such that each vertex is incident with at most one edge in $M$. In other 
words the edges in the matching $M$ do not touch each other. The size of the matching, $|M|$, is the 
number of edges in $M$. We denote the size of the maximum possible matching by $|M^*|$. The trivial 
relation $|M^*| \le N/2$ follows from the definition. If a maximum matching covers all the vertices, 
$|M^*| = N/2$, we call $M^*$ a perfect matching. 

Finding a maximum matching in a given graph $G$ is a polynomial problem. For instance the 
algorithm of [21] solves this problem with a computational complexity proportional to $O(|E|\sqrt{|V|})$. 

How many matchings of size $|M^*|$ can we actually find in $G$? No exact polynomial algorithm 
to answer this question is known. Counting the number of matchings of a given size was proven 
[?] to belong to the #P-complete (sharp-P-complete) class of problems. It means that if an 
exact polynomial algorithm for this problem existed we could also count solutions of all the other 
problems belonging to the NP class. It is generally believed that a polynomial procedure to solve 
#P-complete problems does not exist. For this reason it is very useful to develop methods to 
count matchings fast (in polynomial time) but only approximately. Several works have been done 
in this direction [?]. 

In this paper we study not only properties of matchings on a given graph $G$ but also on 
ensembles $\mathcal{G}$ of large sparse random graphs. When we claim that a property A is true for a typical 
random graph $G \in \mathcal{G}$ we mean: when $G$ is chosen from the ensemble with its natural probability 
law, the probability that A is true goes to one as the size of $G$ grows to infinity. 

2.1. Rigorous results for matching on random graphs 

In this section we give some known rigorous results for matchings on random graphs. From the 
point of view of matching the simplest ensemble is the one of r-regular random graphs, i.e. all 
graphs where every vertex has degree (number of neighbors) r. In this ensemble the measure is 
uniform over all r-regular graphs. 

Theorem 1 (Bollobas and McKay '86 [25]): If $r \ge 3$ and the number of vertices $N$ is even 
then almost every $r$-regular graph has a perfect matching. Denote by $\mathcal{N}_G$ the number of perfect 
matchings of an $r$-regular graph $G$. Then its first two moments are 

$$\mathbb{E}(\mathcal{N}_G) \approx \sqrt{2}\, e^{1/4} \left[\frac{(r-1)^{r-1}}{r^{r-2}}\right]^{N/2} , \qquad (1)$$

$$\mathbb{E}(\mathcal{N}_G^2) \approx \sqrt{\frac{r-1}{r-2}}\; e^{-\frac{2r-1}{4(r-1)^2}}\, \left[\mathbb{E}(\mathcal{N}_G)\right]^2 . \qquad (2)$$

In the statistical physics language we call the logarithm of the first moment $\log[\mathbb{E}(\mathcal{N}_G)]$ the 
annealed entropy of perfect matchings and the typical average $\mathbb{E}[\log \mathcal{N}_G]$ ‡ the quenched entropy of 
perfect matchings. Due to the concavity of the logarithm, an upper bound for the quenched 
entropy follows from (1): 

$$\mathbb{E}[\log \mathcal{N}_G] \le \log[\mathbb{E}(\mathcal{N}_G)] = N\left[(r-1)\log(r-1) - (r-2)\log r\right]/2 + O(1) . \qquad (3)$$

In section 4.1 we will show that for $r$-regular graphs the quenched entropy is in fact the same as 
the annealed one, i.e. in (3) the bound is tight. Note that the fact that $\mathbb{E}[\mathcal{N}_G^2] \approx [\mathbb{E}(\mathcal{N}_G)]^2$ 
to leading exponential order is not enough to prove that the quenched and annealed entropies are 
equal. 

In this paper we will be interested in random graphs with a fixed degree distribution: we 
call $Q(k)$ the probability that a randomly chosen vertex has degree $k$, in the asymptotic limit of 
large graphs. In particular, in Erdos-Renyi (ER) random graphs, where every edge is present with 
probability $p = c/(N-1)$, the degree distribution is Poissonian, $Q(k) = e^{-c} c^k / k!$. Because of the 
existence of a finite fraction of isolated vertices, perfect matchings almost surely do not exist in ER 
graphs. The size of the maximum possible matching was computed in a seminal paper of Karp and 
Sipser [26]. 

Theorem 2 (Karp and Sipser '81): The maximum matching in an Erdos-Renyi random 
graph with $N$ vertices and mean degree $c$ has on average size 

$$\mathbb{E}(|M^*|) \approx \frac{N}{2}\left[1 - p_1(c) + p_2(c) - c\,p_1(c) + c\,p_1(c)\,p_2(c)\right] , \qquad (4)$$

where $p_1(c)$ is the smallest solution of the equation $p = \exp[-c \exp(-cp)]$ and $p_2(c) = 1 - \exp[-c\,p_1(c)]$. 

When $c < e$ there is only one solution for $p_1(c)$. When $c > e$ another pair of solutions for 
$p_1(c)$ appears. 
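
As a numerical illustration of Theorem 2, the following minimal sketch evaluates $p_1(c)$, $p_2(c)$ and the matched fraction $\mathbb{E}(|M^*|)/N$, taking eq. (4) in the form $\frac{1}{2}[1 - p_1 + p_2 - c p_1 + c p_1 p_2]$. It is not part of the original analysis: the function name `karp_sipser_fraction`, the tolerance and the iteration cap are illustrative choices, and the sketch assumes that iterating $p \mapsto \exp[-c\exp(-cp)]$ from $p = 0$ reaches the smallest fixed point (the map is increasing in $p$, so the iterates grow monotonically towards it).

```python
import math


def karp_sipser_fraction(c, tol=1e-15, max_iter=1_000_000):
    """Return (p1, p2, E|M*|/N) for an Erdos-Renyi graph of mean degree c.

    p1 is obtained by fixed-point iteration started at 0, which converges
    to the smallest solution of p = exp(-c * exp(-c * p)).
    """
    p = 0.0
    for _ in range(max_iter):
        new_p = math.exp(-c * math.exp(-c * p))
        converged = abs(new_p - p) < tol
        p = new_p
        if converged:
            break
    p1 = p
    p2 = 1.0 - math.exp(-c * p1)
    # Matched fraction, eq. (4) divided by N.
    size = 0.5 * (1.0 - p1 + p2 - c * p1 + c * p1 * p2)
    return p1, p2, size


if __name__ == "__main__":
    for c in (0.5, 1.0, 2.0, 3.0, 5.0):
        p1, p2, size = karp_sipser_fraction(c)
        print(f"c={c:4.1f}  p1={p1:.6f}  p2={p2:.6f}  |M*|/N={size:.6f}")
```

Close to $c = e$ the iteration slows down considerably, which is why the cap on the number of iterations is generous.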



2.2. Karp-Sipser leaf removal, the core 

The Karp-Sipser theorem was originally proven by analyzing a greedy leaf removal algorithm [26]. 
This algorithm consists of two steps: 

(1) Given a graph G, if there are leaves choose randomly one of them, i, and its incident edge (ij). 
Put this edge into the matching and remove the two vertices i and j. Delete at the same time 
all the edges incident with j. Repeat until there are no leaves. 

(2) If there are no leaves in G choose randomly an edge (ij) with uniform probability, add it to 
the matching and erase all the edges incident with i and j. Go to step (1). 
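
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: vertices are labelled 0, ..., n-1, the graph is given as an edge list, and the names `karp_sipser` and `remove_vertex` are hypothetical. The sketch rescans the vertex set at every step, so it is quadratic rather than linear in the graph size.

```python
import random


def karp_sipser(n, edges, seed=0):
    """Greedy Karp-Sipser matching on vertices 0..n-1 (illustrative sketch)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    def remove_vertex(v):
        # Delete v together with all its incident edges.
        for u in adj[v]:
            adj[u].discard(v)
        adj[v] = set()

    matching = []
    while True:
        leaves = [v for v in adj if len(adj[v]) == 1]
        if leaves:
            # Step (1): match a random leaf with its unique neighbour.
            i = rng.choice(leaves)
            j = next(iter(adj[i]))
        else:
            # Step (2): no leaves left, match a uniformly random edge.
            remaining = [(a, b) for a in adj for b in adj[a] if a < b]
            if not remaining:
                break
            i, j = rng.choice(remaining)
        matching.append((i, j))
        remove_vertex(i)
        remove_vertex(j)
    return matching


if __name__ == "__main__":
    print(karp_sipser(4, [(0, 1), (1, 2), (2, 3)]))
```

On a tree only step (1) is ever used, and the resulting matching is a maximum matching; step (2) is a heuristic that acts on the core.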

We define as the core of the graph all the non-isolated vertices (and the edges between them) which 
remain in the graph after the first step of the leaf removal procedure. The core does not depend on 
the history of the leaf removal [37]. Karp and Sipser proved that for $c < e$ the core is small (zero 
asymptotically) whereas for $c > e$ the core covers a finite fraction of all the vertices. They also 
proved that when a large (of order $N$) core exists, asymptotically all its nodes can be matched. 

We call $v(c)$ the fraction of vertices in the core of a typical ER random graph of average degree 
$c$, $l(c)N$ the number of edges in the core, and $m(c)N$ the number of edges matched in the leaf 
removal procedure. It is known [27] that 

$$v(c) = p_3(1 - c\,p_1) , \qquad l(c) = \frac{c}{2}\, p_3^2 , \qquad m(c) = p_2 - \frac{c}{2}\, p_1^2 , \qquad (5)$$

where $p_1$, $p_2$ are the same parameters as in Theorem 2 of Karp and Sipser, and $p_3 = 1 - p_1 - p_2$. 
Properties of the core were also studied in [?] and [?]. From these results it follows that the 
degree distribution in the core is Poissonian-like, 

$$Q(0) = Q(1) = 0 , \qquad Q(k) = \frac{1}{C}\,\frac{(c\,p_3)^k}{k!} \quad \text{for } k > 1 , \qquad (6)$$

where $C$ is a normalization constant. We will study connections between the Karp-Sipser leaf 
removal and our method in section 3.4.1. 

‡ We should write $\mathbb{E}[\log(\mathcal{N}_G + 1)]$ for the quenched entropy to avoid $-\infty$ for graphs which do not have any perfect 
matching. 
2.3. The annealed average 

We denote by $\mathcal{N}_G(x)$ the number of matchings of size $|M| = xN/2$ in a graph $G$. Its expectation 
$\mathbb{E}[\mathcal{N}_G(x)]$ in the random $r$-regular graph ensemble can be computed as the number of all possible 
matchings of size $|M|$ times the leading order of the number of all $r$-regular graphs which contain 
a given matching $M$, divided by the leading order of the number of all $r$-regular graphs. We keep 
in mind that $r$ is finite and $N \to \infty$. This gives the annealed entropy: 

$$\frac{\log \mathbb{E}[\mathcal{N}_G(x)]}{N} = \left(x - \frac{r}{2}\right)\log r + \frac{r-x}{2}\,\log(r-x) - (1-x)\log(1-x) - \frac{x}{2}\log x . \qquad (7)$$

Again, thanks to the concavity of the logarithm, the quenched entropy cannot be larger than the 
annealed one, $\mathbb{E}[\log \mathcal{N}_G(x)]/N \le \log \mathbb{E}[\mathcal{N}_G(x)]/N$. We will see in section 4.1 that for $r$-regular graphs this 
upper bound is actually tight. 
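
As a quick consistency check, the annealed entropy of $r$-regular graphs can be evaluated numerically and compared at $x = 1$ with the perfect matching result (3). The sketch below assumes eq. (7) in the form $(x - r/2)\log r + \frac{r-x}{2}\log(r-x) - (1-x)\log(1-x) - \frac{x}{2}\log x$; the function names are illustrative, and the convention $0 \log 0 = 0$ is used at the end points.

```python
import math


def xlogx(x):
    """x * log(x) with the convention 0 * log(0) = 0."""
    return x * math.log(x) if x > 0.0 else 0.0


def annealed_entropy_regular(x, r):
    """Annealed entropy density log E[N_G(x)] / N for r-regular graphs,
    where x = 2|M|/N is the fraction of matched vertices (0 <= x <= 1)."""
    return ((x - r / 2.0) * math.log(r)
            + 0.5 * xlogx(r - x)
            - xlogx(1.0 - x)
            - 0.5 * xlogx(x))


def perfect_matching_entropy(r):
    """Leading order of eq. (3): [(r-1) log(r-1) - (r-2) log r] / 2."""
    return 0.5 * ((r - 1) * math.log(r - 1) - (r - 2) * math.log(r))


if __name__ == "__main__":
    for r in (3, 4, 5):
        print(r, annealed_entropy_regular(1.0, r), perfect_matching_entropy(r))
```

At $x = 0$ the expression vanishes, as it should: there is exactly one empty matching.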

In the ensemble of ER random graphs the expectation of the number of matchings of size 
$|M| = xN/2$ is computed in the very same way and reads 

$$\mathbb{E}[\mathcal{N}_G(x)] \approx \exp\left\{ \frac{Nx}{2} \left[ \ln\frac{c}{x} - 1 - 2\left(\frac{1}{x} - 1\right)\ln(1-x) \right] \right\} . \qquad (8)$$

If the exponent is negative (which happens for $c < e$ and $x$ sufficiently large) then there is almost 
surely no matching of size $|M|$ in the graph $G$. On the other hand, if the exponent is positive then 
eq. (8) provides us with an upper bound on the quenched (typical) entropy, 

$$\frac{\mathbb{E}\{\log[\mathcal{N}_G(x)]\}}{N} \le \frac{\log \mathbb{E}[\mathcal{N}_G(x)]}{N} = \frac{x}{2}\left[ \ln\frac{c}{x} - 1 - 2\left(\frac{1}{x} - 1\right)\ln(1-x) \right] . \qquad (9)$$

For ER random graphs the bound is not tight. From eq. (8) we see that for $c > e$ the average 
number of perfect matchings ($x = 1$) is exponentially large. But we know that for a typical 
ER graph no perfect matching exists (due to the presence of isolated vertices). The reason is 
that $\mathbb{E}[\mathcal{N}_G(x)]$ is dominated by a few exceptional graphs $G$ which have a huge number of perfect 
matchings. The correct quenched average $\mathbb{E}\{\log[\mathcal{N}_G(x)]\}/N$ will be computed in section 4.2. 



3. Cavity method: general formalism 

3.1. Statistical physics description 

We describe a matching by the variables $s_i = s_{(ab)} \in \{0, 1\}$ assigned to each edge $i = (ab)$ of $G$, 
with $s_i = 1$ if $i \in M$ and $s_i = 0$ otherwise. The hard constraints that two edges in a matching 
cannot touch impose that, on each vertex $a \in V$: $\sum_{b,(ab)\in E} s_{(ab)} \le 1$. To complete our statistical 
physics description, we define for each given graph $G$ an energy (or cost) function which gives, for 
each matching $M = \{s\}$, the number of unmatched vertices: 

$$E_G(M = \{s\}) = \sum_a E_a(\{s\}) = N - 2|M| , \qquad (10)$$

where $E_a = 1 - \sum_b s_{(ab)}$. The Boltzmann probability law in the space of matchings is defined by: 

$$P_G(M) = \frac{1}{Z_G(\beta)}\, e^{-\beta E_G(M)} , \qquad (11)$$

where $\beta$ is the inverse temperature and $Z_G(\beta)$ is the partition function. 




Figure 1. On the left, example of a graph with six nodes and six edges. On the right, the 
corresponding factor graph with six functional nodes (squares) and six variable nodes (circles). 
We use a factor graph representation [?] of the Boltzmann probability (11). To a graph $G$ 
we associate a factor graph $\mathcal{F}(G)$ as follows (see fig. 1): to each edge of $G$ corresponds a variable 
node (circle) in $\mathcal{F}(G)$; to each vertex of $G$ corresponds a function node (square) in $\mathcal{F}(G)$. We 
shall index the variable nodes by indices $i, j, k, \dots$ and function nodes by $a, b, c, \dots$. The variable $s_i$ 
takes value $s_i = 1$ if the corresponding edge is in the matching, and $s_i = 0$ if it is not. For a given 
configuration $s = \{s_1, \dots, s_{|E|}\}$, the weight of function node $a$ is 

$$\psi_a(s) = \mathbb{I}\Big(\sum_{i \in V(a)} s_i \le 1\Big)\, e^{-\beta\left(1 - \sum_{i \in V(a)} s_i\right)} , \qquad (12)$$

where $V(a)$ is the set of all the variable nodes which are neighbours of function node $a$, and the 
total Boltzmann weight of a configuration is $\frac{1}{Z_G(\beta)} \prod_a \psi_a(s)$. Later on, when no confusion can 
arise, we denote $V(a)$ just as $a$. 

We want to compute the internal energy $E_G(\beta)$ (the expectation value of the number of 
unmatched vertices) and the entropy $S_G(\beta)$ (the logarithm of the number of matchings). For 
$\beta \to \infty$ (zero temperature limit) these two quantities give the ground state properties, i.e. 
respectively the size and entropy of the maximum size matchings. 

We are interested in the "thermodynamic" limit of large graphs ($N \to \infty$), and we shall 
compute expectations over ensembles of graphs of the densities of thermodynamical potentials 
$e(\beta) = \mathbb{E}[E_G(\beta)]/N$ and $s(\beta) = \mathbb{E}[S_G(\beta)]/N$, as well as the average free energy density 

$$f(\beta) = -\frac{1}{\beta N}\, \mathbb{E}[\log Z_G(\beta)] = \frac{1}{N}\, \mathbb{E}[F_G(\beta)] = e(\beta) - \frac{1}{\beta}\, s(\beta) . \qquad (13)$$

The reason for this interest is that one expects, for reasonable graph ensembles, $F_G(\beta)$ to be self-averaging. 
This means that the distribution of $F_G(\beta)/N$ becomes more and more sharply peaked 
around $f(\beta)$ when $N$ increases. 



3.2. The cavity method at finite temperature 

Figure 2. Part of the factor graph used to compute $P^{i\to a}_{s_i}$. 

In the following we use the cavity method at the replica symmetric (RS) level, as it is described 
in [?]. We introduce a "cavity" in the factor graph by deleting the function node $a$ and its incident 
edges, and we denote by $P^{i\to a}_{s_i}$ the probability that variable $i$ takes value $s_i$ (see fig. 2). Here $b$ 
denotes the other function node adjacent to variable $i$. Because 
of the local tree-like structure of the (sparse) graphs that we study, it is reasonable to assume that 
the $P^{j\to b}_{s_j}$ for $j \in b - i$ are uncorrelated. This is the main assumption of the cavity method at the 
RS level (see [?]) and we will check later on its self-consistency. Using this assumption one gets 

$$P^{i\to a}_{s_i} = \frac{1}{C^{i\to a}} \sum_{\{s_j\}} \mathbb{I}\Big(s_i + \sum_{j \in b-i} s_j \le 1\Big)\, e^{-\beta\left(1 - s_i - \sum_{j \in b-i} s_j\right)} \prod_{j \in b-i} P^{j\to b}_{s_j} , \qquad (14)$$

where $C^{i\to a}$ is a normalization constant. 

For every edge between a variable $i$ and a function node $a$, we define a cavity field $h^{i\to a}$ as 

$$P^{i\to a}_{s_i} = \frac{e^{\beta h^{i\to a} s_i}}{1 + e^{\beta h^{i\to a}}} . \qquad (15)$$

The recursion relation between cavity fields is then: 

$$h^{i\to a} = -\frac{1}{\beta} \log\Big[ e^{-\beta} + \sum_{j \in b-i} e^{\beta h^{j\to b}} \Big] . \qquad (16)$$

This is one form of the "belief propagation" equations [?]. The cavity fields can be interpreted 
as messages living on the edges of the factor graph, with some consistency rules on the function 
nodes, and one can try to solve them by an iterative "message passing" procedure. 

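Assuming that one has found the cavity fields, one can deduce from them the various marginal 
probabilities and the free energy. For instance the expectation value (with respect to Boltzmann's 
distribution) of the occupation number $s_i$ of a given edge $i = (ab)$ is equal to 

$$\langle s_i \rangle = \frac{1}{1 + e^{-\beta(h^{i\to a} + h^{i\to b})}} . \qquad (17)$$

To compute the free energy we first define the free energy shift $\Delta F_{a + i \in a}$ after addition of 
a function node $a$ and all the edges $i$ around it, and the free energy shift $\Delta F_i$ after addition of an 
edge $i = (ab)$. These are given by: 

$$e^{-\beta \Delta F_{a + i \in a}} = e^{-\beta} + \sum_{i \in a} e^{\beta h^{i\to a}} , \qquad (18)$$

$$e^{-\beta \Delta F_i} = 1 + e^{\beta(h^{i\to a} + h^{i\to b})} . \qquad (19)$$

The total free energy is then [?]: 

$$F_G(\beta) = \sum_a \Delta F_{a + i \in a} - \sum_i \Delta F_i . \qquad (20)$$

This form of the free energy is variational, i.e. the derivative $\partial(\beta F_G)/\partial h^{i\to a}$ vanishes if and only if the fields 
satisfy (16). This allows one to compute easily the internal energy 

$$E_G(\beta) = \frac{\partial[\beta F_G(\beta)]}{\partial \beta} = \sum_a \frac{e^{-\beta} - \sum_{i \in a} h^{i\to a} e^{\beta h^{i\to a}}}{e^{-\beta} + \sum_{i \in a} e^{\beta h^{i\to a}}} + \sum_i \frac{(h^{i\to a} + h^{i\to b})\, e^{\beta(h^{i\to a} + h^{i\to b})}}{1 + e^{\beta(h^{i\to a} + h^{i\to b})}} = \sum_a \frac{e^{-\beta}}{e^{-\beta} + \sum_{i \in a} e^{\beta h^{i\to a}}} . \qquad (21)$$

The second and third equalities have been derived using eq. (16). In the last term we can recognize 
the probability that a node $a$ is not matched. The entropy is then obtained as 

$$S_G(\beta) = \beta\,[E_G(\beta) - F_G(\beta)] . \qquad (22)$$

All the equations (14)-(22) hold on a single large sparse graph $G$. In section 3.4 we will 
describe how to use them to build algorithms for counting and sampling the matchings on a given 
graph. 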

We now study the typical instances in an ensemble of graphs. We denote the average over 
the ensemble by $\mathbb{E}(\cdot)$. We assume that the random graph ensemble is given by a prescribed degree 
distribution $Q(k)$. Let us call $\mathcal{P}_\beta(h)$ the distribution of cavity fields over all the edges of a large 
typical graph from the graph ensemble. It satisfies the following self-consistent equation 

$$\mathcal{P}_\beta(h) = \sum_{k=1}^{\infty} \frac{k\,Q(k)}{c} \int \prod_{i=1}^{k-1} \left[ dh^i\, \mathcal{P}_\beta(h^i) \right] \delta\Big( h + \frac{1}{\beta} \log\Big[ e^{-\beta} + \sum_{i=1}^{k-1} e^{\beta h^i} \Big] \Big) . \qquad (23)$$

The term $kQ(k)/c$ is the normalized degree distribution of the function node $a$ when one picks 
uniformly at random an edge $a - i$ from the factor graph; $c = \sum_k kQ(k)$ is the mean degree. This 
equation for the distribution $\mathcal{P}_\beta(h)$ can be solved numerically by a technique of population dynamics. 
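
A minimal sketch of such a population dynamics procedure is given below, written in Python as an illustration rather than as the implementation used for the results of this paper. The distribution of cavity fields is represented by a finite population; each update draws a degree $k$ from $kQ(k)/c$, picks $k - 1$ fields at random from the population and replaces one member by $-\frac{1}{\beta}\log[e^{-\beta} + \sum_i e^{\beta h^i}]$. The names `population_dynamics` and `poisson_edge_degree`, the population size and the number of sweeps are all assumptions of this sketch.

```python
import math
import random


def poisson_edge_degree(c):
    """Sampler of k from k Q(k) / c for Poissonian Q(k): k = 1 + Poisson(c)."""
    threshold = math.exp(-c)

    def sample(rng):
        k, p = 0, rng.random()
        while p > threshold:  # Knuth's multiplicative Poisson sampler
            k += 1
            p *= rng.random()
        return 1 + k

    return sample


def population_dynamics(sample_k, beta, pop_size=200, sweeps=300, seed=0):
    """Return a population of cavity fields approximating the fixed point
    of the self-consistent equation for the field distribution."""
    rng = random.Random(seed)
    pop = [rng.uniform(-1.0, 1.0) for _ in range(pop_size)]
    for _ in range(sweeps):
        for n in range(pop_size):
            k = sample_k(rng)  # degree of the function node reached by the edge
            total = math.exp(-beta) + sum(
                math.exp(beta * rng.choice(pop)) for _ in range(k - 1))
            pop[n] = -math.log(total) / beta
    return pop


if __name__ == "__main__":
    pop = population_dynamics(poisson_edge_degree(3.0), beta=2.0)
    print(min(pop), sum(pop) / len(pop), max(pop))
```

For an $r$-regular ensemble one can pass `lambda rng: r` as the sampler; the population then collapses onto a single value of the field, which gives a simple way of checking the code.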

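The average of the free energy density is then 

$$f(\beta) = -\frac{1}{\beta} \sum_{k=0}^{\infty} Q(k) \int \prod_{i=1}^{k} \left[ dh^i\, \mathcal{P}_\beta(h^i) \right] \log\Big( e^{-\beta} + \sum_{i=1}^{k} e^{\beta h^i} \Big) + \frac{c}{2\beta} \int dh^1 dh^2\, \mathcal{P}_\beta(h^1)\, \mathcal{P}_\beta(h^2)\, \log\left( 1 + e^{\beta(h^1 + h^2)} \right) . \qquad (24)$$

This expression for the free energy is in its variational form (see [?]), i.e. the functional derivative 
$\delta f(\beta)/\delta \mathcal{P}_\beta(h)$ vanishes if and only if $\mathcal{P}_\beta$ satisfies (23). The average energy density is then equal to: 

$$e(\beta) = \sum_{k=0}^{\infty} Q(k) \int \prod_{i=1}^{k} \left[ dh^i\, \mathcal{P}_\beta(h^i) \right] \frac{e^{-\beta} - \sum_{i=1}^{k} h^i e^{\beta h^i}}{e^{-\beta} + \sum_{i=1}^{k} e^{\beta h^i}} + \frac{c}{2} \int dh^1 dh^2\, \mathcal{P}_\beta(h^1)\, \mathcal{P}_\beta(h^2)\, \frac{(h^1 + h^2)\, e^{\beta(h^1 + h^2)}}{1 + e^{\beta(h^1 + h^2)}} . \qquad (25)$$

The average entropy density is 

$$s(\beta) = \beta\,[e(\beta) - f(\beta)] . \qquad (26)$$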

All our computations up to now rely on the only assumption (the 'RS cavity assumption') 
that the neighbors of a node in a cavity are uncorrelated. A necessary condition for the validity of 
this assumption is that the following nonlinear (spin-glass) susceptibility be finite [?]: 

$$\chi_{SG} = \sum_i \mathbb{E}\left( \langle s_0 s_i \rangle_c^2 \right) = \sum_{d=0}^{\infty} \alpha^d\, \mathbb{E}\left( \langle s_0 s_d \rangle_c^2 \right) . \qquad (27)$$

Here $\langle s_0 s_i \rangle_c$ is the connected correlation function between the reference edge 0 and edge $i$, $\alpha^d$ is 
the average number of vertices at distance $d$ from the edge 0; for a general degree distribution 
$\alpha = \sum_{k=0}^{\infty} k(k+1)Q(k+1)/c$. The susceptibility is finite if and only if 

$$\lambda_T = \lim_{d \to \infty} \alpha \left[ \mathbb{E}\left( \langle s_0 s_d \rangle_c^2 \right) \right]^{1/d} < 1 . \qquad (28)$$

We will call $\lambda_T$ the finite temperature stability parameter. A necessary condition for the RS cavity 
assumption to hold is that $\lambda_T < 1$. 





Figure 3. Chain of the cavity fields used to compute the finite temperature stability parameter. 



Using the fluctuation-dissipation relation, when edge $d$ is at distance $d$ from edge 0 we have for 
the correlation function 

$$\mathbb{E}\left( \langle s_0 s_d \rangle_c^2 \right) = C\, \mathbb{E}\left[ \left( \frac{\partial h_d}{\partial h_0} \right)^2 \right] = C\, \mathbb{E}\left[ \prod_{i=1}^{d} \left( \frac{\partial h_i}{\partial h_{i-1}} \right)^2 \right] , \qquad (29)$$

where $C$ is a $d$-independent constant. The field $h_i$ is according to (16) a function of $h_{i-1}$ and of the other 
fields $h^{(k)}_{i-1}$ incoming to $h_i$, see fig. 3: 

$$h_i = -\frac{1}{\beta} \log\Big[ e^{-\beta} + e^{\beta h_{i-1}} + \sum_{k=1}^{p_{i-1}} e^{\beta h^{(k)}_{i-1}} \Big] . \qquad (30)$$

The number $p_{i-1}$ of the incoming fields is chosen according to the probability distribution 
$Q_2(p_{i-1})$ of the number of additional neighbors of a node given that this node already has two other neighbors, 
$Q_2(k) = (k+2)(k+1)Q(k+2)/(\alpha c)$; in particular for Poisson distributions $Q_2 = Q$. The values 
of the fields $h^{(k)}_{i-1}$ are chosen randomly from the distribution (23). 



3.3. Zero temperature limit 

The zero temperature limit ($\beta \to \infty$) corresponds to the ground state (maximum matching) of our 
system. Let us investigate the explicit behavior of that limit. 

Our numerical studies of (23) show that for large $\beta$ the cavity field distribution $\mathcal{P}_\beta(h)$ peaks 
around three different values $h \in \{0, \pm 1\}$: 

$$\mathcal{P}(h) = p_1\, \delta(h - 1) + p_2\, \delta(h + 1) + p_3\, \delta(h) , \qquad (31)$$

where $p_1$, $p_2$ and $p_3$ are the weights (probabilities) of $h = 1$, $-1$ and $0$. The cavity fields update 
becomes 

$$h^{i\to a} = -\max_{j \in b-i} \left( -1,\, h^{j\to b} \right) . \qquad (32)$$

These equations may also be derived by working directly at zero temperature as in [35]. We 
define the cavity energy $E^{i\to a}_{s_i}$ as the ground state energy of the subgraph containing edge $i$ when 
constraint $a$ is absent (fig. 2) and edge $i$ takes value $s_i$. The analog of (14) is 

$$E^{i\to a}_{s_i} = \min_{\{s_j\}:\; s_i + \sum_{j \in b-i} s_j \le 1} \Big( 1 - s_i - \sum_{j \in b-i} s_j + \sum_{j \in b-i} E^{j\to b}_{s_j} \Big) . \qquad (33)$$

If one defines the cavity fields as 

$$h^{i\to a} = E^{i\to a}_0 - E^{i\to a}_1 , \qquad (34)$$

then (33) gives back the cavity fields update (32). The difference between cavity energies when $i$ 
is (is not) matched may be $\pm 1$ or $0$ and those are the three possible values of cavity fields. 
Eq. (23), taken in the $\beta \to \infty$ limit, shows that 

$$p_1 = \frac{1}{c} \sum_{k=0}^{\infty} (k+1)\, Q(k+1)\, p_2^k , \qquad (35)$$

$$p_2 = \frac{1}{c} \sum_{k=0}^{\infty} (k+1)\, Q(k+1)\, \left[ 1 - (1 - p_1)^k \right] , \qquad (36)$$

$$p_3 = \frac{1}{c} \sum_{k=0}^{\infty} (k+1)\, Q(k+1)\, \left[ (1 - p_1)^k - p_2^k \right] . \qquad (37)$$

The possible solutions to these equations depend on the distribution $Q(k)$. 

(a) There always exists a solution with $p_3 = 0$, $p_1 = 1 - p_2$. 

(b) For graphs without leaves, $Q(1) = 0$, there exists a solution with $p_3 = 1$, $p_1 = p_2 = 0$. 

(c) For graphs with leaves, $Q(1) > 0$, a solution with $0 < p_1, p_2, p_3 < 1$ exists if the mean degree is 
sufficiently large. 

Let us stress at this point that our numerical solution of (23) for the cavity fields distribution at 
very small but nonzero temperatures corresponds to the case $p_3 > 0$, (b) or (c). In other words, whereas 
at zero temperature there exist two mathematically possible solutions of (35)-(37), at arbitrarily 
small temperature only the one with $p_3 > 0$ exists. In the rest of this section we describe this 
"small temperature" solution. Case (a), which exists only at strictly zero temperature, and which 
forbids the cavity fields $h = 0$, will be discussed in section 5. 

Using (24) the ground state energy, related to the size of the maximum matching, is 

$$e_0 = Q(0) + \sum_{k=1}^{\infty} Q(k) \left[ p_2^k + (1 - p_1)^k - 1 \right] + c\, p_1 (1 - p_2) . \qquad (38)$$

If we consider solution (b) for $p_1$, $p_2$, $p_3$ for graphs with no leaves, $Q(1) = 0$, then the ground state 
energy is $e_0 = Q(0)$, i.e. asymptotically all the non-isolated vertices are matched. In other words, 
in an ensemble of random graphs with minimal degree 2 (e.g. regular graphs) almost every graph 
has an almost perfect matching. This is in agreement with the result of Karp and Sipser [26] and 
also with a stronger result of Frieze and Pittel [?], who also found the (small) number of vertices 
which cannot be matched. 
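
The zero temperature equations can be solved by simple iteration for any degree distribution. The sketch below is an illustration under stated assumptions: it takes $Q(k)$ as a Python dictionary, iterates $p_1 = \frac{1}{c}\sum_k k Q(k) p_2^{k-1}$ and $p_2 = \frac{1}{c}\sum_k k Q(k)[1 - (1-p_1)^{k-1}]$ starting from $p_1 = p_2 = 0$, and evaluates the ground state energy $e_0 = Q(0) + \sum_{k \ge 1} Q(k)[p_2^k + (1-p_1)^k - 1] + c p_1 (1 - p_2)$. The starting point is chosen so that the iteration lands on the solution with the smallest $p_1$; the function names and the truncation of the Poisson distribution at `k_max` are choices of this sketch.

```python
import math


def poisson(c, k_max=60):
    """Poissonian degree distribution truncated at k_max (assumed negligible tail)."""
    return {k: math.exp(-c) * c ** k / math.factorial(k) for k in range(k_max)}


def zero_temperature_solution(Q, iterations=2000):
    """Iterate the zero temperature cavity equations for the weights
    (p1, p2, p3) and return them together with the ground state energy e0."""
    c = sum(k * q for k, q in Q.items())  # mean degree
    p1 = p2 = 0.0
    for _ in range(iterations):
        p1 = sum(k * q * p2 ** (k - 1)
                 for k, q in Q.items() if k >= 1) / c
        p2 = sum(k * q * (1.0 - (1.0 - p1) ** (k - 1))
                 for k, q in Q.items() if k >= 1) / c
    p3 = 1.0 - p1 - p2
    e0 = (Q.get(0, 0.0)
          + sum(q * (p2 ** k + (1.0 - p1) ** k - 1.0)
                for k, q in Q.items() if k >= 1)
          + c * p1 * (1.0 - p2))
    return p1, p2, p3, e0


if __name__ == "__main__":
    for c in (1.0, 2.0, 3.0, 5.0):
        p1, p2, p3, e0 = zero_temperature_solution(poisson(c))
        print(f"c={c}: p1={p1:.5f} p2={p2:.5f} p3={p3:.5f} |M*|/N={(1 - e0) / 2:.5f}")
```

For a 3-regular graph, `zero_temperature_solution({3: 1.0})` stays at solution (b), $p_3 = 1$, with $e_0 = 0$; for Poissonian $Q(k)$ the matched fraction $(1 - e_0)/2$ can be compared with Theorem 2.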

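To compute the average ground state entropy we need to expand the free energy at low 
temperatures, $f(\beta \to \infty) = e_0 - s_0/\beta + O(1/\beta^2)$. This requires studying the "evanescent" parts of 
the cavity fields, i.e. the leading corrections to their value at $\beta = \infty$. Numerically we have 
observed that at $\beta \gg 1$ the three delta peaks (31) keep their weights $(p_1, p_2, p_3)$ and spread as 

$$h = 1 + \frac{\log \nu}{\beta} \quad \text{for the peak around } h = 1 , \qquad (39)$$

$$h = -1 + \frac{\log \mu}{\beta} \quad \text{for the peak around } h = -1 , \qquad (40)$$

$$h = \frac{\log \gamma}{\beta} \quad \text{for the peak around } h = 0 . \qquad (41)$$

From (23) we derive self-consistently the distributions $A$ of the evanescent cavity fields $\nu$, $\mu$, $\gamma$: 

$$A_1(\nu) = \sum_{k=0}^{\infty} C_1(k) \int \prod_{i=1}^{k} \left[ d\mu_i\, A_2(\mu_i) \right] \delta\Big( \nu - \frac{1}{1 + \sum_{i=1}^{k} \mu_i} \Big) , \qquad (42)$$

$$A_2(\mu) = \sum_{k=1}^{\infty} C_2(k) \int \prod_{i=1}^{k} \left[ d\nu_i\, A_1(\nu_i) \right] \delta\Big( \mu - \frac{1}{\sum_{i=1}^{k} \nu_i} \Big) , \qquad (43)$$

$$A_3(\gamma) = \sum_{k=1}^{\infty} C_3(k) \int \prod_{i=1}^{k} \left[ d\gamma_i\, A_3(\gamma_i) \right] \delta\Big( \gamma - \frac{1}{\sum_{i=1}^{k} \gamma_i} \Big) , \qquad (44)$$

where the combinatorial factors $C_1$, $C_2$, $C_3$ are given by $C_1(k) = \frac{(k+1)\, p_2^k\, Q(k+1)}{c\, p_1}$, 
$C_2(k) = \frac{1}{c\, p_2} \sum_{m=k}^{\infty} (m+1)\, Q(m+1) \binom{m}{k} p_1^k (1 - p_1)^{m-k}$, and 
$C_3(k) = \frac{1}{c\, p_3} \sum_{m=k}^{\infty} (m+1)\, Q(m+1) \binom{m}{k} p_3^k\, p_2^{m-k}$. Using eqs. (39)-(44) we 
expand the free energy to order $1/\beta$ and get the ground state entropy of maximum matchings 

$$s_0 = \sum_{k=1}^{\infty} \frac{p_3^k}{k!} \sum_{m=k}^{\infty} \frac{m!\, Q(m)\, p_2^{m-k}}{(m-k)!}\; \overline{\log \sum_{i=1}^{k} \gamma_i} \;-\; c\, p_1 p_3\, \overline{\log \gamma} \;-\; \frac{c}{2}\, p_3^2\; \overline{\log(1 + \gamma_1 \gamma_2)}$$
$$\quad -\; c\, p_1 (p_1 + p_3)\, \overline{\log \nu} \;+\; \sum_{k=1}^{\infty} \frac{p_1^k}{k!} \sum_{m=k}^{\infty} \frac{m!\, Q(m)\, (1 - p_1)^{m-k}}{(m-k)!}\; \overline{\log \sum_{i=1}^{k} \nu_i}$$
$$\quad +\; \sum_{k=0}^{\infty} Q(k)\, p_2^k\; \overline{\log\Big( 1 + \sum_{i=1}^{k} \mu_i \Big)} \;-\; c\, p_1 p_2\; \overline{\log(1 + \nu \mu)} , \qquad (45)$$

where the overlines denote expectations over independent random variables with distribution $A_1$ 
(for $\nu$-variables), $A_2$ (for $\mu$-variables), $A_3$ (for $\gamma$-variables). 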

To conclude this section we describe the zero temperature version of the stability analysis for 
the cavity assumption. What follows is equivalent to the stability analysis of the replica symmetric 
assumption with respect to replica symmetry breaking for discrete sets of cavity fields [38]. Here 
we will describe this stability analysis in an intuitive way as a spread of changes in the cavity fields 
update that is analogous to what is called bug proliferation in the context of the stability of the 
one step replica symmetry breaking ansatz [33, 37]. 

Consider a node $b$ with $k + 1$ neighbors, fig. 3. Choose one incoming cavity field $h_i$ and one 
outgoing field $h_0$. Now consider the probability $P_k(\alpha_0 \to \gamma_0 | \alpha_i \to \gamma_i)$ that the value of the outgoing 
field changes from $\alpha_0 \in \{\pm 1, 0\}$ to $\gamma_0 \in \{\pm 1, 0\}$ provided that the incoming one has been changed from 
$\alpha_i \in \{\pm 1, 0\}$ to $\gamma_i \in \{\pm 1, 0\}$. More precisely, $P_k$ is the probability of having a set of the other $k - 1$ incoming 
fields such that it causes the change $\alpha_0 \to \gamma_0$ given that the change $\alpha_i \to \gamma_i$ happened. There are 
eight different combinations of cavity fields such that $P_k(\alpha_0 \to \gamma_0 | \alpha_i \to \gamma_i)$ is nonzero: 

$$P_k(1 \to -1 \,|\, -1 \to 1) = P_k(-1 \to 1 \,|\, 1 \to -1) = p_2^{k-1} , \qquad (46)$$
$$P_k(1 \to 0 \,|\, -1 \to 0) = P_k(0 \to 1 \,|\, 0 \to -1) = p_2^{k-1} , \qquad (47)$$
$$P_k(-1 \to 0 \,|\, 1 \to 0) = P_k(0 \to -1 \,|\, 0 \to 1) = (p_2 + p_3)^{k-1} , \qquad (48)$$
$$P_k(-1 \to 0 \,|\, 1 \to -1) = P_k(0 \to -1 \,|\, -1 \to 1) = (p_2 + p_3)^{k-1} - p_2^{k-1} . \qquad (49)$$

In the first step we change a cavity field from $\alpha_i$ to $\gamma_i$. The average probability of the change $\alpha_0$ to $\gamma_0$ 
that follows is 

$$P(\alpha_0 \to \gamma_0 \,|\, \alpha_i \to \gamma_i) = \sum_{k=0}^{\infty} Q_2(k)\, P_{k+1}(\alpha_0 \to \gamma_0 \,|\, \alpha_i \to \gamma_i) . \qquad (50)$$

The most important change is given by the largest eigenvalue $\lambda_{\max}$ of the matrix $P$. In analogy 
with (28) we are interested in the stability parameter $\lambda_0 = \alpha \lambda_{\max}$, 

$$\lambda_0 = \alpha \sqrt{ \Big[ \sum_{k=0}^{\infty} Q_2(k)\, p_2^k \Big] \Big[ \sum_{k=0}^{\infty} Q_2(k)\, (p_2 + p_3)^k \Big] } . \qquad (51)$$

If $\lambda_0 > 1$ the total number of changes after many steps will diverge and we cannot hope for the cavity 
assumption to be valid. On the other hand if $\lambda_0 < 1$ then the first change will not spread very far 
and the RS assumption is locally stable. 

Note also that $\lambda_0$ and $\lambda_{T \to 0}$ are not equal, because they count different quantities. But we 
expect that their position relative to the threshold value 1 is the same. In other words both of 
them correctly describe the stability at zero temperature. The advantage of $\lambda_0$ is that it is far 
easier to compute than the $d \to \infty$ limit of $\lambda_T$, (28)-(29). 



3.4. Algorithms following from the cavity method 

3.4.1. Zero temperature message passing and leaf removal The zero temperature limit of the cavity 
fields update (32) can be seen as a message passing (warning propagation) algorithm. The interpretation 
of the three different cavity fields is: $h = 1$ means "I want you to match me", $h = -1$ means "I 
want you not to match me", $h = 0$ means "No preferences, do what you want". The interpretation 
of the cavity fields update (32) is: 

• If one or more of my neighbors wants me to match them, I match one of them, and I send: do 
not match me. 

• If none of my neighbors wants me to match it, and at least one of them does not have any 
preferences, I send: no preferences, do what you want. 

• If all of my neighbors are saying do not match me, or if I have no neighbors, I send: match 
me. 

This message passing procedure starting from all the $h = 0$ is equivalent to step (1) of 
Karp and Sipser's leaf removal procedure in the following sense: run the message passing until 
you find a fixed point. Then the edges where a message $h = 1$ or $h = -1$ is sent from at least one 
side are exactly those edges which have been matched or removed in the leaf removal procedure. 
Consequently the edges in the core are those which have sent $h = 0$ from both sides. Using eq. (16) at 
a very small temperature we can use arbitrary initial conditions. 
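
The correspondence with leaf removal can be illustrated by a short warning propagation sketch. The Python code below is an assumption-laden illustration, not the authors' implementation: vertices are labelled 0, ..., n-1, the message sent by edge $i$ towards its endpoint $a$ is stored as `h[(i, a)]`, all messages start from 0, and the update $h^{i\to a} = -\max(-1, \max_{j \in b-i} h^{j\to b})$ is applied sequentially until no message changes. The function names `warning_propagation` and `core_edges` are hypothetical.

```python
def warning_propagation(n, edges):
    """Run the zero temperature update from all messages equal to 0.

    Returns a dict h with h[(i, a)] in {-1, 0, 1}: the message sent by
    edge number i towards its endpoint a.
    """
    incident = [[] for _ in range(n)]
    for i, (a, b) in enumerate(edges):
        incident[a].append(i)
        incident[b].append(i)
    h = {(i, a): 0 for i, e in enumerate(edges) for a in e}
    # Started from 0, a message can only move once (to +1 or -1),
    # so 2|E| + 1 sweeps are always enough to reach the fixed point.
    for _ in range(2 * len(edges) + 1):
        changed = False
        for i, (a, b) in enumerate(edges):
            for target, other in ((a, b), (b, a)):
                incoming = [h[(j, other)] for j in incident[other] if j != i]
                new = -max([-1] + incoming)
                if new != h[(i, target)]:
                    h[(i, target)] = new
                    changed = True
        if not changed:
            break
    return h


def core_edges(n, edges):
    """Edges carrying the message 0 in both directions (the core)."""
    h = warning_propagation(n, edges)
    return [e for i, e in enumerate(edges)
            if h[(i, e[0])] == 0 and h[(i, e[1])] == 0]


if __name__ == "__main__":
    print(core_edges(4, [(0, 1), (1, 2), (2, 3)]))          # a path: empty core
    print(core_edges(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # a cycle: all edges
```

On a path every edge ends up with messages $\pm 1$, in agreement with the fact that leaf removal consumes the whole graph, whereas on a cycle (no leaves) all messages stay at 0 and every edge belongs to the core.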

3.4.2. Uniform sampling Solving the BP equations (16) on a given graph $G$ by iterations allows one 
to sample typical matchings from Boltzmann's distribution (11) at inverse temperature $\beta$ (i.e. 
matchings of size $[N - E_G(\beta)]/2$). 

Such a sampling can be done as follows: one chooses a variable node $i$, computes $\langle s_i \rangle$ from 
(17), and generates the value of $s_i$ as $s_i = 1$ with probability $\langle s_i \rangle$, and $s_i = 0$ with probability 
$1 - \langle s_i \rangle$. Once $s_i$ has been fixed, this imposes that all the fields $h^{i\to a}$ (for all function nodes $a$ 
connected to $i$) are equal either to $+\infty$ (if $s_i = 1$) or to $-\infty$ (if $s_i = 0$). One runs again the BP 
equations, with these extra constraints, and iterates this procedure. 
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3.4.3. Counting matchings on a single graph Our results may be also used to estimate the size 
l|25|l and number of matchings H26|l on arbitrary spare large graph G. The size of the maximum 
matching is obtained from the zero temperature limit of H20() or H21I) . this is not very interesting 
since the solution to this problem is well known, e.g. |20j . An algorithm to compute approximately 
the number of matchings of given size is more interesting. 

INPUT: The graph $G$, the inverse temperature $\beta$, a maximum number of iterations $t_{\max}$.
OUTPUT: The entropy $S_G = \log \mathcal{N}_G$ of matchings of size $(1 - E_G)/2$. If at the end $E_G = -1$, the procedure failed to converge.

1. Initialize all the cavity fields $h^{i \to a}$ to some random value.

2. Iterate the belief propagation equations (16) until they converge, i.e. the values of the cavity fields do not change anymore, or until the number of iterations exceeds $t_{\max}$.

3. Set $E_G = -1$. If the number of steps is $> t_{\max}$, STOP. Else: compute the energy $E_G$ and the free energy $F_G$ of matchings corresponding to $\beta$ according to (21) and (20), and compute the entropy $S_G = \beta [E_G - F_G]$.

In order to compute the total number of matchings one needs to take the $\beta \to 0$ limit. To compute the number of maximal matchings one takes $\beta \to \infty$. In both cases the algorithm can be rewritten and simplified.
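
The three steps of the listing can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the text: fields are stored as m = exp(beta*h) so that $\beta = 0$ causes no division by zero, the energy is evaluated as the BP estimate of the number of unmatched vertices (which should coincide with (21) at a fixed point), the returned energy is the total $N - 2|M|$ rather than a density, and the function name `bp_matching_entropy` is hypothetical.

```python
import math
import random

def bp_matching_entropy(n, edges, beta, t_max=1000, tol=1e-12, seed=0):
    """Bethe estimate (energy, entropy) for matchings of a simple graph."""
    rng = random.Random(seed)
    u = math.exp(-beta)
    adj = {a: [] for a in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    # Step 1: random initial fields; m[(a, b)] is sent by edge {a, b} to vertex a
    m = {(a, b): rng.uniform(0.5, 1.5) for a in adj for b in adj[a]}
    # Step 2: iterate BP until the fields stop changing or t_max is exceeded
    for _ in range(t_max):
        new = {(a, b): 1.0 / (u + sum(m[(b, c)] for c in adj[b] if c != a))
               for (a, b) in m}
        delta = max((abs(new[k] - m[k]) for k in m), default=0.0)
        m = new
        if delta < tol:
            break
    else:
        return -1.0, float("nan")        # E_G = -1 signals failure to converge
    # Step 3: energy, log-partition function, and entropy S = beta * (E - F)
    z = {a: u + sum(m[(a, b)] for b in adj[a]) for a in adj}
    energy = sum(u / z[a] for a in adj)  # expected number of unmatched vertices
    log_z = sum(math.log(z[a]) for a in adj)
    log_z -= sum(math.log(1.0 + m[(a, b)] * m[(b, a)]) for a, b in edges)
    return energy, beta * energy + log_z
```

For example, `bp_matching_entropy(3, [(0, 1), (1, 2)], 0.0)` treats a path with three vertices at $\beta = 0$; the path has three matchings (the empty one and the two single edges), and since BP is exact on trees the entropy comes out as $\log 3$.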

Note also that the complexity of this algorithm is only linear in the number of edges. Convergence and correctness at leading order, $(\log \mathcal{N}_G)/N$, for large sparse graphs or trees are expected for the same reasons as the correctness of the cavity method results for the ensembles of r-regular and ER random graphs.

On small or loopy graphs the cavity field updates need not converge, or their fixed point may depend on the initial conditions. However, it would be interesting to apply the algorithm to "real world"
graphs as in j^OI 1 or to compare the results with those of existing methods j52 |2S1 IM] • 



4. Application to random graph ensembles 

We compute and discuss the results for two random graph ensembles, the r-regular and ER graphs. 



4.1. Random regular graphs



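For r-regular graphs ($Q(k) = \delta_{k,r}$) the results are particularly simple. All the vertices are equivalent in the cavity method. It means that the solution of (28) is $P_\beta(h) = \delta(h - h_r)$, where $h_r$ is the solution of

$$h_r = -\frac{1}{\beta} \log\left[ e^{-\beta} + (r-1) e^{\beta h_r} \right] , \tag{52}$$

given by:

$$h_r = \frac{1}{\beta} \log \frac{-e^{-\beta} + \sqrt{e^{-2\beta} + 4(r-1)}}{2(r-1)} . \tag{53}$$

The free energy density (24) simplifies to

$$f(\beta) = -\frac{1}{\beta} \log\left[ e^{-\beta} + r e^{\beta h_r} \right] + \frac{r}{2\beta} \log\left[ 1 + e^{2\beta h_r} \right] . \tag{54}$$

The energy density (25) reads

$$e(\beta) = \frac{e^{-\beta} - r h_r e^{\beta h_r}}{e^{-\beta} + r e^{\beta h_r}} + \frac{r h_r e^{2\beta h_r}}{1 + e^{2\beta h_r}} . \tag{55}$$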



The entropy, related to the number of matchings, is computed using With a bit of algebra 

one finds that the quenched entropy as a function of energy is equal to the annealed result of eq. 
Q, where e = 1 — x. This result is compatible with, but slightly stronger than the Theorem 1 of 
Bollobas and McKay 

The matching problem is equivalent to the physical problem of dimers on a graph. There it is natural to compute the density of dimers $\rho$ as a function of $\beta$ (which is half of the chemical potential in the context of dimers). For r-regular graphs we find that this "equation of state" is

This result has already been obtained in [19] for the dimer problem on the Bethe lattice.

For $r > 2$ in the zero temperature limit one has $h_r = 0$, which corresponds to the solution (b) of (35) - (37). The ground state energy density is then $e_0 = 0$; this means that asymptotically almost all the vertices may be matched for almost every r-regular graph. This agrees with the stronger mathematical result that for $r \ge 3$ there exists a perfect matching almost surely in graphs with an even number of vertices. From (45) one finds that the ground state entropy density $\lim_{\beta \to \infty} s(\beta)$ is

$$s_0 = [(r-1) \log(r-1) - (r-2) \log r]/2 , \tag{57}$$

in agreement with the annealed result (O of Bollobás and McKay.
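
As a numerical cross-check of the r-regular formulas, the following sketch evaluates the fixed point, the energy and the entropy at a given $\beta$. It is an illustration under assumptions: it works with $x = e^{\beta h_r}$ (the positive root of $(r-1)x^2 + e^{-\beta}x - 1 = 0$, which is the fixed-point equation for $h_r$ rewritten), it uses the simplified energy $e = e^{-\beta}/(e^{-\beta} + r x)$, which follows from the energy density once the fixed-point condition is inserted, and the function name `regular_graph_thermo` is hypothetical.

```python
import math

def regular_graph_thermo(r, beta):
    """Return (energy density, entropy density, matching size per vertex)."""
    u = math.exp(-beta)
    # x = exp(beta * h_r): positive root of (r-1) x^2 + u x - 1 = 0
    x = (-u + math.sqrt(u * u + 4.0 * (r - 1))) / (2.0 * (r - 1))
    z_vertex = u + r * x
    # -beta * f = log Z_vertex - (r/2) log Z_edge  (Bethe free energy)
    minus_beta_f = math.log(z_vertex) - 0.5 * r * math.log(1.0 + x * x)
    e = u / z_vertex                  # fraction of unmatched vertices
    s = beta * e + minus_beta_f       # s = beta * (e - f)
    return e, s, (1.0 - e) / 2.0

# At large beta the entropy approaches the ground state value s_0 of eq. (57)
e, s, m = regular_graph_thermo(3, 40.0)
s0 = ((3 - 1) * math.log(3 - 1) - (3 - 2) * math.log(3)) / 2.0
print(e, s, s0)
```

For $r = 3$ the two entropies agree to numerical precision and the energy is vanishingly small, consistent with almost all vertices being matched.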
The stability parameter $\lambda_T$ for r-regular graphs is

$$\lambda_T = (r-1) \left. \left( \frac{\partial h_i}{\partial h_0} \right)^2 \right|_{h_i = h_0 = h_r} = (r-1) \left[ \frac{-e^{-\beta} + \sqrt{e^{-2\beta} + 4(r-1)}}{2(r-1)} \right]^4 . \tag{58}$$

We see that $\lambda_T < 1$ for all finite temperatures and $r > 2$. In the zero temperature limit $\lambda_{T \to 0} = 1/(r-1)$. The zero temperature stability parameter (51) for r-regular graphs is $\lambda_0 = 0$ for $r > 2$, and $\lambda_0 = 1$ for $r = 2$. This agrees qualitatively with the behavior of $\lambda_{T \to 0}$. It is worth noticing that also the ferromagnetic stability parameter, defined as $(r-1) \left| \partial h_i / \partial h_0 \right|$ evaluated at $h_i = h_0 = h_r$, is at finite temperature smaller than 1 when $r > 2$. It should be possible to use this result in order to show that the matching properties on the root of a large tree are completely independent of boundary conditions, which could then allow for a rigorous proof of our results following the lines of Bandyopadhyay and Gamarnik [16].

In order to exclude the possibility of a discontinuous transition towards a phase with broken replica symmetry, which we cannot see by analyzing the stability, we wrote the 1RSB equations for the r-regular graph. We have clearly seen numerically that their solution reduces to the replica symmetric one. So all the evidence suggests that the RS cavity assumption should be valid for r-regular graphs, and so we expect our result for the quenched entropy to be exact.



4.2. ER graphs

For the ER random graphs, with degree distribution $Q(k) = e^{-c} c^k / k!$, we have solved numerically the equation (23) by the population dynamics method. Using (25) and (26) we have then computed the energy density $e(\beta)$ (related to the size of the matching as $|M| = (1 - e)N/2$) and the entropy density $s(\beta)$ for values of $\beta \in (-\infty, \infty)$. In fig. 4 we show the entropy versus the size of the matching for mean degrees $c = 1, 2, 3$ and $6$.
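
A population dynamics computation of this kind can be sketched as below. This is a rough illustration, not the authors' implementation: it assumes the cavity recursion $h = -\frac{1}{\beta}\log[e^{-\beta} + \sum_j e^{\beta h_j}]$ with a Poissonian number of incoming fields, stores fields as m = exp(beta*h), estimates the energy as the probability that a vertex is unmatched, and uses the Bethe form of the free energy (vertex term minus $c/2$ times the edge term). The function names, the population size and the number of sweeps are arbitrary choices made here, far smaller than those quoted in the caption of fig. 4.

```python
import math
import random

def poisson(rng, mean):
    """Knuth's multiplication method; adequate for small means."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def population_dynamics(c, beta, pop_size=1000, sweeps=100, samples=20000, seed=0):
    """Return (energy density, entropy density) for ER graphs of mean degree c."""
    rng = random.Random(seed)
    u = math.exp(-beta)
    pop = [1.0] * pop_size                       # population of m = exp(beta * h)
    for _ in range(sweeps * pop_size):
        k = poisson(rng, c)                      # number of incoming fields
        total = sum(rng.choice(pop) for _ in range(k))
        pop[rng.randrange(pop_size)] = 1.0 / (u + total)
    e = log_z_vertex = log_z_edge = 0.0
    for _ in range(samples):
        k = poisson(rng, c)                      # degree of a random vertex
        z = u + sum(rng.choice(pop) for _ in range(k))
        e += u / z                               # probability the vertex is unmatched
        log_z_vertex += math.log(z)
        log_z_edge += math.log(1.0 + rng.choice(pop) * rng.choice(pop))
    e /= samples
    minus_beta_f = (log_z_vertex - 0.5 * c * log_z_edge) / samples
    return e, beta * e + minus_beta_f            # s = beta * (e - f)
```

Scanning $\beta$ and plotting the entropy against $(1 - e)/2$ would give a curve of the type shown in fig. 4, up to the statistical noise of such a small population.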

The maxima of the curves in fig. 4 give the entropy of all the possible matchings, regardless of their sizes. The lower right ends of the curves give the ground state energy (the size of the maximal matching) and the ground state entropy (number of maximal matchings). They are computed by the following direct zero temperature method.

The zero temperature equations (35) - (37) for the Poissonian distribution become

$$p_1 = e^{-c(1-p_2)} , \qquad p_2 = 1 - e^{-c p_1} , \qquad p_3 = 1 - p_1 - p_2 . \tag{59}$$

Analyzing these equations we can see that for $c < e$ there exists only the solution (a), with $p_1 + p_2 = 1$ and $p_3 = 0$. For $c > e$ there exists a second solution (c) with $p_1 + p_2 < 1$ and $1 > p_3 > 0$. From the population dynamics solution of eq. (23) at very small temperatures for $c > e$ we found that the solution (c) with $p_1 + p_2 < 1$ is the proper zero temperature limit for $c > e$.

The ground state energy then reads

$$e_0 = 1 - \frac{2|M^*|}{N} = -p_2 + p_1 + c p_1 - c p_1 p_2 . \tag{60}$$

Figure 4. Entropy density $s(m) = \log \mathcal{N}(m) / N$ as a function of the relative size of the matching $m = |M|/N = (1 - e)/2$ for ER random graphs with mean degrees $c = 1, 2, 3, 6$. The lower curve is the ground state entropy density $s_0(e_0)$ for all mean degrees. The curves are obtained
by solving eqs. 1231 - 1^^1 with a population dynamics, using a population of sizes TV = 2 ■ 10'' to 
2 ■ 10^ and the number of iterations tm = 10000. 



This is the exact result of Karp and Sipser [21], Theorem 2.
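
The zero temperature solution can be evaluated with a short fixed-point iteration. The sketch below is an illustration under assumptions: it reads the zero temperature equations as $p_1 = e^{-c(1-p_2)}$, $p_2 = 1 - e^{-c p_1}$, $p_3 = 1 - p_1 - p_2$ and the ground state energy as $e_0 = -p_2 + p_1 + c p_1 - c p_1 p_2$, it iterates on the auxiliary variable $a = c p_1$ starting from $a = 0$ (which selects the branch with $p_3 \ge 0$), and the function name `zero_temperature_er` is hypothetical.

```python
import math

def zero_temperature_er(c, iters=100000, tol=1e-15):
    """Return (p1, p2, p3, e0) for ER graphs of mean degree c."""
    a = 0.0                                   # a = c * p1
    for _ in range(iters):
        b = c * math.exp(-a)                  # b = c * (1 - p2)
        a_new = c * math.exp(-b)
        converged = abs(a_new - a) < tol
        a = a_new
        if converged:
            break
    b = c * math.exp(-a)
    p1 = a / c
    p2 = 1.0 - b / c
    p3 = 1.0 - p1 - p2
    e0 = p1 - p2 + c * p1 - c * p1 * p2       # ground state energy density
    return p1, p2, p3, e0

for c in (1.0, 2.0, 3.0, 6.0):
    p1, p2, p3, e0 = zero_temperature_er(c)
    print(c, round(p3, 4), round((1.0 - e0) / 2.0, 4))
```

The printed values are $p_3$ and the relative size of the maximum matching $(1 - e_0)/2$. For $c < e$ one finds $p_3 = 0$, while for $c > e$ a finite fraction of the fields vanishes. Close to $c = e$ the iteration converges slowly.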

The ground state entropy for ER graphs is computed using the population dynamics equations (42) - (44) with combinatorial factors

$$C_1(k) = \frac{e^{-c p_2} (c p_2)^k}{k!} , \qquad C_2(k) = \frac{e^{-c p_1} (c p_1)^k}{(1 - e^{-c p_1}) \, k!} , \qquad C_3(k) = \frac{e^{-c p_3} (c p_3)^k}{(1 - e^{-c p_3}) \, k!} . \tag{61}$$

The factors $C_i(k)$ are properly normalized Poissonian distributions with mean equal to the concentration of the corresponding cavity fields. The ground state entropy (45) finally simplifies to



So = - (1 + cpi)p3 log7 Y ^ '^1^2) 



- p2 logM - pi{l + cpi + cps) logv- CP1P2 log (1 + nu) . (62) 
We call the first two terms in eq. (62) the core entropy $s_c$; the averages (denoted by overlines) are over the distribution (44). The rest (the last three terms) we call the non-core entropy $s_{nc}$; the averages are over the distributions (42) and (43). The reason for these names is the following. Since we know the size and degree distribution on the core © we can use eq. (45) directly only for the core, and indeed we obtain the first two terms of eq. (62), the core entropy. The rest is the entropy corresponding to the choice of the matching in the non-core part of the graph.

Fig. 5 shows the core and non-core entropies of the maximum matchings and their sum as a function of the average degree $c$. The fourth (upper) line in fig. 5 is the total entropy of all matchings.

The finite temperature stability parameter (28) for ER random graphs is

$$\lambda_T = c \left[ \mathbb{E} \left( \frac{\partial h_d}{\partial h_0} \right)^2 \right]^{1/d} . \tag{63}$$

We have to compute it numerically as described on fig. O and eqs. (28) - (29). As for the r-regular graphs we find that $\lambda_T$ grows as the temperature decreases, see the left panel of fig. 6. So we may analyze only the zero temperature limit, and if that one is stable, then the finite temperature case is stable as well. On the right of fig. 6 we can also see the dependence of $\lambda_T$ on the distance $d$. Although we are not able to compute precisely its $d \to \infty$ limit, all the evidence suggests that even for $d \to \infty$ the stability parameter $\lambda_T$ never exceeds 1.

To check this, we look directly at the zero temperature stability parameter $\lambda_0$ (51), which for ER graphs reads

$$\lambda_0 = c p_1 \sqrt{1 + \frac{p_3}{p_1}} . \tag{64}$$

Figure 5. The ground state entropy density $s_0$ (giving the leading exponential behaviour of the number of maximum matchings) and the full entropy $s_m$ (giving the leading exponential behaviour of the number of all possible matchings) as a function of the mean degree $c$ in ER random graphs. The detail is in the inset. The ground state entropy is the sum of $s_c$, the contribution of the core, and $s_{nc}$, the contribution of the parts of the graph removed in the leaf removal procedure. We see that $s_c > 0$ only for $c > e$, because the core covers a finite fraction of vertices only when $c > e$.

Figure 6. On the left, the finite temperature stability parameter $\lambda_T$ (63) for distance $d = 10$ as a function of the mean degree. The upper curve corresponds to the smallest temperature ($\beta = 50$). Data were obtained for population size $N = 40000$ and time $t = 40000$. On the right, the dependence of $\lambda_T$ for $\beta = 50$ on the mean degree and the distance $d$. We can see that $\lambda_T$ grows slightly with $d$ in the regime $c < e$. For larger $d$ we would need very large populations to obtain reliable data. The continuous line depicts the zero temperature stability parameter $\lambda_0$, eq. (64).



Its value is also depicted in fig. 6. We can see that $\lambda_0 < 1$ (stable) for all mean degrees except $c = e$, where $\lambda_0 = 1$ (marginally stable). Supported by the numerical data in fig. 6, we expect that in the $d \to \infty$ limit $\lambda_T$ would behave in qualitatively the same way.
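
The behaviour of the zero temperature stability parameter can be checked numerically. The sketch below rests on assumptions that are not guaranteed by the text: it reads eq. (64) as $\lambda_0 = c p_1 \sqrt{1 + p_3/p_1}$, it obtains $p_1$, $p_2$, $p_3$ from $p_1 = e^{-c(1-p_2)}$, $p_2 = 1 - e^{-c p_1}$, $p_3 = 1 - p_1 - p_2$ by iterating on $a = c p_1$ from $a = 0$, and the function name `lambda_zero` is hypothetical.

```python
import math

def lambda_zero(c, iters=100000, tol=1e-15):
    """Zero temperature stability parameter for ER graphs of mean degree c."""
    a = 0.0                                        # a = c * p1
    for _ in range(iters):
        a_new = c * math.exp(-c * math.exp(-a))    # one step of the p1, p2 recursion
        converged = abs(a_new - a) < tol
        a = a_new
        if converged:
            break
    p1 = a / c
    p2 = 1.0 - math.exp(-a)
    p3 = max(0.0, 1.0 - p1 - p2)                   # guard against rounding below zero
    return c * p1 * math.sqrt(1.0 + p3 / p1)

for c in (0.5, 1.0, 2.0, 2.5, 3.0, 4.0, 6.0):
    print(c, round(lambda_zero(c), 4))
```

With this reading, $\lambda_0$ stays below 1 on both sides of $c = e$ and approaches 1 only as $c \to e$, where the fixed-point iteration itself becomes slow.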

From this analysis it is reasonable to conjecture that in ER random graphs the replica symmetric cavity assumption is correct and all our results, in particular for the entropy, are exact. Another strong argument in favour of the validity of replica symmetry at any finite temperature will be given in section 5.

The size of the maximum matchings in ER graphs was studied recently by Zhou and Ou-Yang (Z-O) jn|j using the cavity method directly at zero temperature with a one step RSB solution. In this section we discuss the difference between their approach and ours, in particular as far as RSB effects are concerned. We keep to ER random graphs.

One should first emphasize that both approaches give the same result for the size of the largest 
matching in ER graphs, and this result also agrees with the rigorous value of Karp and Sipser. 
Our formalism is more general in two respects. 1) We can work at finite temperature, which gives access to the full distribution of the number of matchings versus their size. 2) We study the limit of zero temperature ($\beta \to \infty$) keeping the leading corrections of order $1/\beta$ in the fields, see (39) - (41): this allows us to study the entropy of maximal matchings.

The issue of RSB at zero temperature, which is present in the Z-O approach and absent in ours, is a somewhat subtle one. We shall just present a few arguments of explanation.

First one should note that one does not expect ergodicity to be broken at any finite temperature 
in this problem. We have not tried to write a formal proof of this statement, but a first strong 
argument comes from the fact that the energy barriers are finite. Let us define as a step the 
fact of removing (adding) an edge from (to) a matching so that the new configuration is still a 
matching. By adding an edge to a matching we lower the energy by 2, whereas by removing an 
edge we increase the energy by 2. Using these steps one may go from any matching $M$ to any other matching $M'$. Furthermore, if $|M'| > |M|$, one can choose the steps in such a way that at every step the energy is not higher than $E_M + 4$. In other words, energy barriers between matchings are at most 4. This argument suggests that there should not be ergodicity breaking at finite temperature, provided there are no diverging entropic barriers. Another indication of ergodicity comes from the rapid mixing results of the Monte Carlo procedure in the related problem of sampling perfect matchings in bipartite graphs [41].

Let us now focus on the zero temperature approach. The finite energy barriers between almost perfect matchings on the core become effectively infinite at zero temperature, so the breaking of ergodicity cannot be excluded. The RS cavity method gives the equations (32) for the cavity fields, easily derived from (16). The more subtle issue is the support of the distribution of cavity fields.

Because a field $h^{i \to a}$ is defined as the difference (34) of the ground state energies conditioned to $i$ being absent/present in the matching, it is clear that $h^{i \to a} \in \{-1, 0, 1\}$, and in ER graphs with $c > e$ one should thus choose between the solutions (a) and (c) of eqs. (35) - (37). If one considers the equations (32) on a graph which is a tree, one finds that actually on all edges $h^{i \to a} \in \{-1, 1\}$. Because ER graphs are locally tree-like (seen from a randomly chosen point, the subgraph of its environment up to a fixed distance $d$ is a tree with probability one in the large $N$ limit), it is tempting to restrict the cavity fields to $\pm 1$ values. This is what is done in Z-O. Then the RS solution, for any value of the average degree $c$, is necessarily solution (a). This solution is unstable towards 1RSB at $c > e$, which forces one to study the 1RSB solution in this regime, as was done in Z-O.

The 1RSB solution for the maximal matchings is able to nicely reconstruct the information that is contained in the $h = 0$ fields of our RS solution with support on $\{-1, 0, 1\}$ as follows: Let us consider an edge $i \to a$ which should pass $h^{i \to a} = 0$ in our formalism. In the 1RSB formalism it passes a message which is a probability distribution on the space of cavity fields, with support $\{-1, 1\}$, of the form $a \delta_{h,-1} + (1 - a) \delta_{h,1}$; the distribution of $a$ is related to our distribution $A_3$ (44). Consequently, the complexity computed by Z-O is equal to the complexity of the core. This means that different almost perfect matchings on the core form the different states, each state containing only one of them (similarly as in the XOR-SAT problem [42, 43], or in the multi-index matching [44]). It is interesting to notice that, through the restriction of cavity fields to $h \in \{-1, 1\}$, the Z-O method at the RS level completely neglects loops, the effect of which is recovered only at the 1RSB level. Conversely, the inclusion of the value 0 in our cavity fields allows one to take loops into account directly, in which case RSB is not needed.

This physical interpretation of the Z-O 1RSB solution is confirmed by its stability analysis:
Using notations of one should compute the type I instability (interpreted as state aggregation) 
of the 1RSB solution. Type II instability (interpreted as division of states) is irrelevant here, because each state corresponds to a single almost perfect matching on the core and cannot divide further. We have found that the 1RSB solution is stable, but only if one considers maximal matchings ($y \to \infty$ in the Z-O notation): any departure from this limit mixes the various configurations and restores ergodicity, as expected.

6. Conclusion and discussion 

We have argued that the replica symmetric cavity solution is exact for counting matchings on 
random graphs. We have computed the size and quenched (typical) entropy in two random graph 
ensembles. For r-regular graphs we have shown that the quenched entropy of matchings of a given 
size agrees with the annealed one. For the Erdos-Renyi random graphs we have shown how our 
method reproduces the result of Karp and Sipser for the size of the maximum matching, and we computed the quenched entropy of matchings of a given size, fig. 4.

Our method provides an algorithm for counting and uniform sampling of matchings on a given 
sparse graph, which should give the exact entropy for graphs with a girth that diverges in the large 
size limit. It would be very interesting to apply it to "real world" graphs, e.g. the internet, 
as in [40]. Also its systematic study on graphs with smaller girth and comparison with existing
approximative methods |22l 1281 1^ could reveal some interesting properties. 

There are two obvious generalizations of the matching problem where we expect that our 
method could be used straightforwardly. One is the matching with weights on edges (preferences 
to be matched) which is a dimer model on random graphs with quenched disorder. Another 
generalization is that instead of allowing a vertex to have none or one (k = 1) matched edge 
around itself, we could allow none or $k > 1$ edges around a vertex to be matched. Then $k = 2$
would mean we are interested in sets of loops, a model that has been studied recently in 001 BBj ■ 
The case k > 2, corresponding to k-regular subgraphs, is being studied by [3^1 ■ 

We hope that the replica symmetric nature of matching on random graphs should allow us to turn all our results into rigorous theorems. In this respect there are two directions which look
particularly promising. One is to generalize the local weak convergence approach of in order 
to turn our results into rigorous theorems when /? is small enough and/or c is far enough from e 
(for ER graphs) . The second one is to use Guerra's interpolation method ^1 E| 02] in order to 
turn our results into rigorous upper bounds for the entropy. More ambitiously, one can hope that 
the study of this matching problem will help to turn the cavity method into a rigorous tool. 